Load the necessary libraries
library(car)
library(rstanarm) #for fitting models in STAN
library(brms) #for fitting models in STAN
library(coda) #for diagnostics
library(bayesplot) #for diagnostics
library(ggmcmc) #for MCMC diagnostics
library(rstan) #for interfacing with STAN
library(emmeans) #for marginal means etc
library(DHARMa) #for residual diagnostics
library(broom) #for tidying outputs
library(tidybayes) #for more tidying outputs
library(ggeffects) #for partial plots
library(broom.mixed)#for summarising models
library(tidyverse) #for data wrangling etc
Loyn (1987) modeled the abundance of forest birds with six predictor variables (patch area, distance to nearest patch, distance to nearest larger patch, grazing intensity, altitude and years since the patch had been isolated).
Regent honeyeater
Format of loyn.csv data file
| ABUND | DIST | LDIST | AREA | GRAZE | ALT | YR.ISOL |
|---|---|---|---|---|---|---|
| .. | .. | .. | .. | .. | .. | .. |
| Variable | Description |
|---|---|
| ABUND | Abundance of forest birds in patch - response variable |
| DIST | Distance to nearest patch - predictor variable |
| LDIST | Distance to nearest larger patch - predictor variable |
| AREA | Size of the patch - predictor variable |
| GRAZE | Grazing intensity (1 to 5, representing light to heavy) - predictor variable |
| ALT | Altitude - predictor variable |
| YR.ISOL | Number of years since the patch was isolated - predictor variable |
The aim of the analysis is to investigate the effects of a range of predictors on the abundance of forest birds.
loyn = read_csv('../public/data/loyn.csv', trim_ws=TRUE)
glimpse(loyn)
## Rows: 56
## Columns: 7
## $ ABUND <dbl> 5.3, 2.0, 1.5, 17.1, 13.8, 14.1, 3.8, 2.2, 3.3, 3.0, 27.6, 1.…
## $ AREA <dbl> 0.1, 0.5, 0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2…
## $ YR.ISOL <dbl> 1968, 1920, 1900, 1966, 1918, 1965, 1955, 1920, 1965, 1900, 1…
## $ DIST <dbl> 39, 234, 104, 66, 246, 234, 467, 284, 156, 311, 66, 93, 39, 4…
## $ LDIST <dbl> 39, 234, 311, 66, 246, 285, 467, 1829, 156, 571, 332, 93, 39,…
## $ GRAZE <dbl> 2, 5, 5, 3, 5, 3, 5, 5, 4, 5, 3, 5, 2, 1, 5, 5, 3, 3, 3, 2, 2…
## $ ALT <dbl> 160, 60, 140, 160, 140, 130, 90, 60, 130, 130, 210, 160, 210,…
When we explored this analysis from a frequentist perspective, we settled on a log-normal-like model: a model fit against a Gaussian distribution, yet with a log link. We will replicate that model here in a Bayesian framework.
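To be clear about the distinction, writing \(\eta_i\) for the linear predictor, the two candidate formulations imply different error structures:

\[ \text{Gaussian with log link:}\quad y_i \sim{} \mathcal{N}(e^{\eta_i}, \sigma^2) \qquad\qquad \text{log-normal:}\quad \log(y_i) \sim{} \mathcal{N}(\eta_i, \sigma^2) \]

The former assumes additive, constant-variance error around an exponentiated mean (the gaussian(link='log') models fitted below); the latter assumes multiplicative error, with variance growing with the mean (the lognormal() family in brms).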
In the previous exploration of this model, we elected to treat Grazing intensity as a categorical variable - we will again code Grazing intensity as a categorical variable.
loyn = loyn %>% mutate(fGRAZE=factor(GRAZE))
Model formula: \[ y_i \sim{} \mathcal{N}(\mu_i, \sigma^2)\\ log(\mu_i) = \boldsymbol{\beta} \bf{X_i} \]
where \(\boldsymbol{\beta}\) is a vector of effects parameters and \(\bf{X}\) is a model matrix representing the additive effects of the scaled versions of distance (ln), distance to the nearest large patch (ln), patch area (ln), grazing intensity, year of isolation and altitude on the abundance of forest birds.
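To make the model matrix \(\bf{X}\) concrete, here is a small sketch (using a made-up data frame, not the loyn data) showing how R expands a continuous predictor and a factor into the columns of \(\bf{X}\):

```r
## hypothetical mini data set: one continuous predictor and one 3-level factor
dat <- data.frame(AREA = c(0.1, 1, 10, 100),
                  fGRAZE = factor(c(1, 2, 3, 2)))
## model.matrix() builds the design matrix exactly as lm()/stan_glm() would:
## an intercept, the scaled log(AREA) column and treatment contrasts for fGRAZE
X <- model.matrix(~scale(log(AREA)) + fGRAZE, data = dat)
colnames(X)   ## "(Intercept)" "scale(log(AREA))" "fGRAZE2" "fGRAZE3"
dim(X)        ## 4 rows (observations) x 4 columns (parameters)
```

Note that the factor contributes one column per non-reference level (fGRAZE2, fGRAZE3), which is why the fitted models below report separate coefficients for each grazing level relative to level 1.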
To re-acquaint ourselves with the data, we will revisit the scatterplot matrix that we generated prior to the frequentist analysis.
scatterplotMatrix(~ABUND+DIST+LDIST+AREA+GRAZE+ALT+YR.ISOL, data=loyn,
diagonal = list(method='boxplot'))
We again notice that DIST, LDIST and AREA are skewed, so we will normalize them via a logarithmic transformation.
scatterplotMatrix(~ABUND+log(DIST)+log(LDIST)+log(AREA)+GRAZE+ALT+YR.ISOL, data=loyn,
diagonal = list(method='boxplot'))
loyn.lm<-lm(ABUND~scale(log(DIST))+scale(log(LDIST))+scale(log(AREA))+
fGRAZE + scale(ALT) + scale(YR.ISOL), data=loyn)
summary(loyn.lm)
##
## Call:
## lm(formula = ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) +
## fGRAZE + scale(ALT) + scale(YR.ISOL), data = loyn)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.8992 -2.7245 -0.2772 2.7052 11.2811
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 22.47274 2.35785 9.531 1.83e-12 ***
## scale(log(DIST)) 0.13774 1.13704 0.121 0.9041
## scale(log(LDIST)) 0.45855 1.22886 0.373 0.7107
## scale(log(AREA)) 5.55123 1.22130 4.545 3.97e-05 ***
## fGRAZE2 0.52851 3.25221 0.163 0.8716
## fGRAZE3 0.06601 2.95871 0.022 0.9823
## fGRAZE4 -1.24877 3.19838 -0.390 0.6980
## fGRAZE5 -12.47309 4.77827 -2.610 0.0122 *
## scale(ALT) 0.46575 1.04041 0.448 0.6565
## scale(YR.ISOL) -0.32680 1.48462 -0.220 0.8267
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 6.105 on 46 degrees of freedom
## Multiple R-squared: 0.7295, Adjusted R-squared: 0.6766
## F-statistic: 13.78 on 9 and 46 DF, p-value: 2.115e-10
In rstanarm, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent overfitting) and to help stabilise the computations.
loyn.rstanarm = stan_glm(ABUND ~ scale(log(DIST), scale=FALSE)+
scale(log(LDIST), scale=FALSE)+
scale(log(AREA), scale=FALSE)+
fGRAZE+
scale(ALT, scale=FALSE)+
scale(YR.ISOL, scale=FALSE),
data=loyn,
family=gaussian(link='log'),
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0)
Conclusions:
prior_summary(loyn.rstanarm)
## Priors for model 'loyn.rstanarm'
## ------
## Intercept (after predictors centered)
## Specified prior:
## ~ normal(location = 0, scale = 2.5)
## Adjusted prior:
## ~ normal(location = 0, scale = 27)
##
## Coefficients
## Specified prior:
## ~ normal(location = [0,0,0,...], scale = [2.5,2.5,2.5,...])
## Adjusted prior:
## ~ normal(location = [0,0,0,...], scale = [28.17,20.27,14.35,...])
##
## Auxiliary (sigma)
## Specified prior:
## ~ exponential(rate = 1)
## Adjusted prior:
## ~ exponential(rate = 0.093)
## ------
## See help('prior_summary.stanreg') for more details
Conclusions:
loyn.rstanarm1 <- update(loyn.rstanarm, prior_PD=TRUE)
ggemmeans(loyn.rstanarm1, ~AREA) %>% plot(add.data=TRUE) + scale_y_log10()
Conclusions:
I will also overlay the raw data for comparison.
loyn.rstanarm2 <- stan_glm(ABUND ~ scale(log(DIST))+
scale(log(LDIST))+
scale(log(AREA))+
fGRAZE+
scale(ALT)+
scale(YR.ISOL), data=loyn,
family=gaussian(link='log'),
prior_intercept = normal(3, 10, autoscale=FALSE),
prior = normal(0, 2.5, autoscale=FALSE),
prior_aux = cauchy(0, 5),
prior_PD=TRUE,
iter = 5000, thin=5,chains = 3, warmup=2000,
refresh=0)
Conclusions:
ggemmeans(loyn.rstanarm2, ~AREA) %>%
plot(add.data=TRUE) + scale_y_log10()
Now let's refit, conditioning on the data.
loyn.rstanarm3= update(loyn.rstanarm2, prior_PD=FALSE)
posterior_vs_prior(loyn.rstanarm3, color_by='vs', group_by=TRUE,
facet_args=list(scales='free_y'))
Conclusions:
#ggemmeans(loyn.rstanarm3, ~AREA) %>% plot(add.data=TRUE) + scale_y_log10()
In brms, the default priors are designed to be weakly informative. They are chosen to provide moderate regularisation (to help prevent overfitting) and to help stabilise the computations.
Unlike rstanarm, brms models must be compiled before they start sampling. For most models, compilation of the Stan code takes around 45 seconds.
loyn.brm = brm(bf(ABUND ~ scale(log(DIST))+
scale(log(LDIST))+
scale(log(AREA))+
fGRAZE+
scale(ALT)+
scale(YR.ISOL),
family=lognormal()),
data=loyn,
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0)
prior_summary(loyn.brm)
## prior class coef group resp dpar nlpar bound source
## (flat) b default
## (flat) b fGRAZE2 (vectorized)
## (flat) b fGRAZE3 (vectorized)
## (flat) b fGRAZE4 (vectorized)
## (flat) b fGRAZE5 (vectorized)
## (flat) b scaleALT (vectorized)
## (flat) b scalelogAREA (vectorized)
## (flat) b scalelogDIST (vectorized)
## (flat) b scalelogLDIST (vectorized)
## (flat) b scaleYR.ISOL (vectorized)
## student_t(3, 3, 2.5) Intercept default
## student_t(3, 0, 2.5) sigma default
loyn.brm1 = brm(bf(ABUND ~ scale(log(DIST))+
scale(log(LDIST))+
scale(log(AREA))+
fGRAZE+
scale(ALT)+
scale(YR.ISOL),
family=gaussian(link='log')),
data=loyn,
prior=c(
prior(normal(0, 2.5), class='b')),
sample_prior = 'only',
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0)
ggemmeans(loyn.brm1, ~AREA) %>% plot(add.data=TRUE) + scale_y_log10()
conditional_effects(loyn.brm1) %>% plot(points=TRUE)
Conclusions:
loyn.brm2 = brm(bf(ABUND ~ scale(log(DIST))+
scale(log(LDIST))+
scale(log(AREA))+
fGRAZE+
scale(ALT)+
scale(YR.ISOL),
family=gaussian(link='log')),
data=loyn,
prior=c(
prior(normal(0, 10), class='Intercept'),
prior(normal(0, 2.5), class='b'),
prior(cauchy(0, 5), class='sigma')
),
sample_prior = 'only',
iter = 5000, warmup = 1000,
chains = 3, thin = 5, refresh = 0)
ggemmeans(loyn.brm2, ~AREA) %>%
plot(add.data=TRUE) + scale_y_log10()
loyn.brm3 <- update(loyn.brm2, sample_prior=TRUE, refresh=0)
loyn.brm3 %>% get_variables()
## [1] "b_Intercept" "b_scalelogDIST" "b_scalelogLDIST" "b_scalelogAREA"
## [5] "b_fGRAZE2" "b_fGRAZE3" "b_fGRAZE4" "b_fGRAZE5"
## [9] "b_scaleALT" "b_scaleYR.ISOL" "sigma" "prior_Intercept"
## [13] "prior_b" "prior_sigma" "lp__" "accept_stat__"
## [17] "stepsize__" "treedepth__" "n_leapfrog__" "divergent__"
## [21] "energy__"
loyn.brm3 %>%
posterior_samples %>%
select(-`lp__`) %>%
gather %>%
mutate(Type=ifelse(str_detect(key, 'prior'), 'Prior', 'b'),
Class=ifelse(str_detect(key, 'Intercept'), 'Intercept',
ifelse(str_detect(key, 'b'), 'b', 'sigma'))) %>%
ggplot(aes(x=Type, y=value)) +
stat_pointinterval()+
facet_wrap(~Class, scales='free')
standata(loyn.brm3)
## $N
## [1] 56
##
## $Y
## [1] 5.3 2.0 1.5 17.1 13.8 14.1 3.8 2.2 3.3 3.0 27.6 1.8 21.2 14.6 8.0
## [16] 3.5 29.0 2.9 24.3 19.4 24.4 5.0 15.8 25.3 19.5 20.9 16.3 18.8 19.9 13.0
## [31] 6.8 21.7 27.8 26.8 16.6 30.4 11.5 26.0 25.7 12.7 23.5 24.9 29.0 28.3 28.3
## [46] 32.0 37.7 39.6 29.6 31.0 34.4 27.3 30.5 33.0 29.5 30.9
##
## $K
## [1] 10
##
## $X
## Intercept scalelogDIST scalelogLDIST scalelogAREA fGRAZE2 fGRAZE3 fGRAZE4
## 1 1 -1.51019511 -1.65892116 -2.37802367 1 0 0
## 2 1 0.37029448 -0.30532977 -1.51765965 0 0 0
## 3 1 -0.48079400 -0.09042449 -1.51765965 0 0 0
## 4 1 -0.95804924 -1.26148217 -1.14712103 0 1 0
## 5 1 0.42278148 -0.26754922 -1.14712103 0 0 0
## 6 1 0.37029448 -0.15637842 -1.14712103 0 1 0
## 7 1 1.09552219 0.21669491 -1.14712103 0 0 0
## 8 1 0.57353754 1.24803687 -1.14712103 0 0 0
## 9 1 -0.05524976 -0.61163991 -1.14712103 0 0 1
## 10 1 0.66885367 0.36858640 -1.14712103 0 0 0
## 11 1 -0.95804924 -0.04106159 -0.77658241 0 1 0
## 12 1 -0.59812145 -1.00240327 -0.77658241 0 0 0
## 13 1 -1.51019511 -1.65892116 -0.77658241 1 0 0
## 14 1 0.93822292 0.10346964 -0.77658241 0 0 0
## 15 1 0.47682817 -0.22864597 -0.77658241 0 0 0
## 16 1 -0.24660010 0.43442972 -0.77658241 0 0 0
## 17 1 -1.93573935 -1.96523129 -0.55983122 0 1 0
## 18 1 -1.93573935 -1.96523129 -0.55983122 0 1 0
## 19 1 -1.48362354 -1.63979473 -0.40604380 0 1 0
## 20 1 0.47682817 -0.22864597 -0.40604380 1 0 0
## 21 1 0.37029448 0.29645165 -0.40604380 1 0 0
## 22 1 -1.93573935 1.38927509 -0.40604380 0 0 0
## 23 1 -1.51019511 -1.65892116 -0.28675700 0 1 0
## 24 1 0.85682391 0.04487798 -0.28675700 0 0 0
## 25 1 -0.59812145 -0.33160908 -0.18929260 0 1 0
## 26 1 -0.03525827 0.79868571 -0.18929260 0 1 0
## 27 1 0.57722655 0.69705984 -0.10688762 0 1 0
## 28 1 -0.22265561 -0.73213997 -0.10688762 0 0 1
## 29 1 0.50481706 -0.20849935 -0.03550518 0 0 1
## 30 1 0.79284436 1.26397616 0.02745860 0 0 0
## 31 1 0.75311975 0.29645165 0.08378161 0 1 0
## 32 1 -1.89613008 -1.93672022 0.13473198 0 0 1
## 33 1 -0.03525827 -0.59724988 0.18124602 0 0 1
## 34 1 -0.22265561 -0.73213997 0.18124602 0 0 1
## 35 1 -0.12477989 -0.66168825 0.22403479 1 0 0
## 36 1 -0.12477989 0.09591504 0.30053281 0 1 0
## 37 1 0.90372228 1.51230754 0.36744180 0 0 0
## 38 1 -1.48362354 1.66778539 0.39799721 1 0 0
## 39 1 0.50481706 0.99128245 0.42690016 0 0 1
## 40 1 0.66885367 1.53439756 0.50527059 0 0 0
## 41 1 1.35327186 0.40222517 0.59457340 0 0 0
## 42 1 1.25762760 0.59446805 0.65294853 0 1 0
## 43 1 0.24667868 -0.39430941 0.70557205 0 0 0
## 44 1 -0.95804924 -0.01204503 0.73798041 0 0 0
## 45 1 0.57722655 -0.51197475 0.82485885 1 0 0
## 46 1 -0.59812145 -1.00240327 0.87580921 0 1 0
## 47 1 0.47682817 0.98837574 0.92232325 0 1 0
## 48 1 2.26783778 1.12640241 0.93334579 0 0 0
## 49 1 0.92772762 1.07832552 0.94414564 0 0 0
## 50 1 1.09552219 0.21669491 1.01418997 0 0 0
## 51 1 -1.51019511 0.29645165 1.29286187 1 0 0
## 52 1 0.93822292 1.91565164 1.35582564 0 0 0
## 53 1 1.09552219 1.39201101 1.47113789 0 0 0
## 54 1 -0.12477989 -0.07123733 1.50961306 0 0 0
## 55 1 0.75311975 1.00336997 2.53095496 0 0 0
## 56 1 0.73743154 -0.04106159 2.85111978 0 0 0
## fGRAZE5 scaleALT scaleYR.ISOL
## 1 0 0.31588146 0.7134084
## 2 1 -1.98143824 -1.1629534
## 3 1 -0.14358248 -1.9447708
## 4 0 0.31588146 0.6352266
## 5 1 -0.14358248 -1.2411351
## 6 0 -0.37331445 0.5961358
## 7 1 -1.29224233 0.2052271
## 8 1 -1.98143824 -1.1629534
## 9 0 -0.37331445 0.5961358
## 10 1 -0.37331445 -1.9447708
## 11 0 1.46454131 -0.9284082
## 12 1 0.31588146 -2.3356795
## 13 0 1.46454131 0.9088627
## 14 0 1.46454131 0.8697719
## 15 1 -0.60304642 -1.9447708
## 16 1 -0.02871650 -1.9447708
## 17 0 -0.83277839 0.4788632
## 18 0 -0.14358248 0.5961358
## 19 0 1.00507737 0.4006814
## 20 0 -1.29224233 0.1270453
## 21 0 1.69427328 0.9088627
## 22 1 -0.60304642 -1.0456808
## 23 0 -0.37331445 0.5961358
## 24 0 -1.06251036 0.6743175
## 25 0 0.54561343 -2.3356795
## 26 0 0.08614949 0.4006814
## 27 0 -0.37331445 0.5961358
## 28 0 1.46454131 0.4006814
## 29 0 -0.60304642 0.9088627
## 30 1 -1.29224233 -1.5538621
## 31 0 -0.83277839 0.4788632
## 32 0 0.66047941 0.4006814
## 33 0 -0.83277839 0.5179540
## 34 0 1.23480934 0.4006814
## 35 0 1.00507737 0.7134084
## 36 0 -0.60304642 0.6352266
## 37 1 -1.06251036 -1.1629534
## 38 0 1.00507737 0.6352266
## 39 0 0.08614949 0.9088627
## 40 1 -1.29224233 -1.2411351
## 41 0 -0.14358248 0.5179540
## 42 0 -0.37331445 0.5961358
## 43 0 1.00507737 0.9479536
## 44 0 -0.83277839 0.5961358
## 45 0 -0.60304642 0.4788632
## 46 0 1.00507737 0.4006814
## 47 0 -0.60304642 -0.8502264
## 48 0 0.77534540 0.8697719
## 49 0 -0.14358248 0.6743175
## 50 0 0.43074744 0.5179540
## 51 0 0.66047941 1.0261353
## 52 0 -1.75170627 0.5570449
## 53 0 0.31588146 0.5570449
## 54 0 1.00507737 -0.3811360
## 55 0 1.00507737 0.7915901
## 56 0 2.61320116 -0.6547721
## attr(,"assign")
## [1] 0 1 2 3 4 4 4 4 5 6
## attr(,"contrasts")
## attr(,"contrasts")$fGRAZE
## 2 3 4 5
## 1 0 0 0 0
## 2 1 0 0 0
## 3 0 1 0 0
## 4 0 0 1 0
## 5 0 0 0 1
##
##
## $prior_only
## [1] 0
##
## attr(,"class")
## [1] "standata"
stancode(loyn.brm3)
## // generated with brms 2.14.0
## functions {
## }
## data {
## int<lower=1> N; // total number of observations
## vector[N] Y; // response variable
## int<lower=1> K; // number of population-level effects
## matrix[N, K] X; // population-level design matrix
## int prior_only; // should the likelihood be ignored?
## }
## transformed data {
## int Kc = K - 1;
## matrix[N, Kc] Xc; // centered version of X without an intercept
## vector[Kc] means_X; // column means of X before centering
## for (i in 2:K) {
## means_X[i - 1] = mean(X[, i]);
## Xc[, i - 1] = X[, i] - means_X[i - 1];
## }
## }
## parameters {
## vector[Kc] b; // population-level effects
## real Intercept; // temporary intercept for centered predictors
## real<lower=0> sigma; // residual SD
## }
## transformed parameters {
## }
## model {
## // likelihood including all constants
## if (!prior_only) {
## // initialize linear predictor term
## vector[N] mu = Intercept + Xc * b;
## for (n in 1:N) {
## // apply the inverse link function
## mu[n] = exp(mu[n]);
## }
## target += normal_lpdf(Y | mu, sigma);
## }
## // priors including all constants
## target += normal_lpdf(b | 0, 2);
## target += normal_lpdf(Intercept | 0, 5);
## target += cauchy_lpdf(sigma | 0, 5)
## - 1 * cauchy_lccdf(0 | 0, 5);
## }
## generated quantities {
## // actual population-level intercept
## real b_Intercept = Intercept - dot_product(means_X, b);
## // additionally draw samples from priors
## real prior_b = normal_rng(0,2);
## real prior_Intercept = normal_rng(0,5);
## real prior_sigma = cauchy_rng(0,5);
## // use rejection sampling for truncated priors
## while (prior_sigma < 0) {
## prior_sigma = cauchy_rng(0,5);
## }
## }
In addition to the regular model diagnostics, Bayesian analyses also require exploration of the MCMC sampling diagnostics, to be sure that the chains are well mixed and have converged on a stable posterior.
The available tests range from big-picture summaries of overall chain characteristics to very specific, detailed checks that allow the experienced modeller to drill down to the fine details of chain behaviour. Furthermore, there is a multitude of packages and approaches for exploring these diagnostics.
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
available_mcmc()
## bayesplot MCMC module:
## mcmc_acf
## mcmc_acf_bar
## mcmc_areas
## mcmc_areas_data
## mcmc_areas_ridges
## mcmc_areas_ridges_data
## mcmc_combo
## mcmc_dens
## mcmc_dens_chains
## mcmc_dens_chains_data
## mcmc_dens_overlay
## mcmc_hex
## mcmc_hist
## mcmc_hist_by_chain
## mcmc_intervals
## mcmc_intervals_data
## mcmc_neff
## mcmc_neff_data
## mcmc_neff_hist
## mcmc_nuts_acceptance
## mcmc_nuts_divergence
## mcmc_nuts_energy
## mcmc_nuts_stepsize
## mcmc_nuts_treedepth
## mcmc_pairs
## mcmc_parcoord
## mcmc_parcoord_data
## mcmc_rank_hist
## mcmc_rank_overlay
## mcmc_recover_hist
## mcmc_recover_intervals
## mcmc_recover_scatter
## mcmc_rhat
## mcmc_rhat_data
## mcmc_rhat_hist
## mcmc_scatter
## mcmc_trace
## mcmc_trace_data
## mcmc_trace_highlight
## mcmc_violin
Of these, we will focus on:
plot(loyn.rstanarm3, plotfun='mcmc_trace')
The chains appear well mixed and very similar
plot(loyn.rstanarm3, 'acf_bar')
There is no evidence of autocorrelation in the MCMC samples
plot(loyn.rstanarm3, 'rhat_hist')
All Rhat values are below 1.05, suggesting the chains have converged.
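As a reminder of what Rhat measures, here is a minimal hand-rolled version of the classic potential scale reduction factor, computed on simulated, well-mixed chains rather than the fitted model (modern implementations additionally split each chain in half and rank-normalise, but the idea is the same):

```r
set.seed(1)
## three well-mixed 'chains', all sampling the same distribution
chains <- matrix(rnorm(3 * 1000), ncol = 3)
n <- nrow(chains)
W <- mean(apply(chains, 2, var))   ## mean within-chain variance
B <- n * var(colMeans(chains))     ## between-chain variance
## pooled variance estimate and the potential scale reduction factor
var_plus <- (n - 1) / n * W + B / n
rhat <- sqrt(var_plus / W)
rhat   ## close to 1 for converged, well-mixed chains
```

If the chains were sampling different regions (i.e. had not converged), B would dominate W and Rhat would exceed 1.05.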
neff (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
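To build intuition for how autocorrelation erodes the effective sample size, the following self-contained sketch computes a crude ESS and its ratio on a simulated AR(1) "chain" (not the fitted model):

```r
set.seed(1)
N <- 10000
## strongly autocorrelated 'chain' (AR(1) with rho = 0.9)
chain <- as.numeric(arima.sim(list(ar = 0.9), n = N))
## crude ESS estimate: N / (1 + 2 * sum of the initial positive autocorrelations)
rho <- acf(chain, lag.max = 200, plot = FALSE)$acf[-1]
rho <- rho[cumsum(rho < 0) == 0]   ## keep the initial run of positive lags
ess <- N / (1 + 2 * sum(rho))
ess / N   ## well below 0.5 - a red flag if seen in a real model
```

With rho = 0.9, only a few percent of the draws are effectively independent, which is exactly the situation the neff diagnostic is designed to flag.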
plot(loyn.rstanarm3, 'neff_hist')
Ratios all very high.
plot(loyn.rstanarm3, 'combo')
plot(loyn.rstanarm3, 'violin')
The rstan package offers its own range of MCMC diagnostics.
Of these, we will focus on:
stan_trace(loyn.rstanarm3)
The chains appear well mixed and very similar
stan_ac(loyn.rstanarm3)
There is no evidence of autocorrelation in the MCMC samples
stan_rhat(loyn.rstanarm3)
All Rhat values are below 1.05, suggesting the chains have converged.
stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
stan_ess(loyn.rstanarm3)
Ratios all very high.
stan_dens(loyn.rstanarm3, separate_chains = TRUE)
The ggmcmc package also has a set of MCMC diagnostic functions.
Of these, we will focus on:
loyn.ggs <- ggs(loyn.rstanarm3)
ggs_traceplot(loyn.ggs)
The chains appear well mixed and very similar
ggs_autocorrelation(loyn.ggs)
There is no evidence of autocorrelation in the MCMC samples
ggs_Rhat(loyn.ggs)
All Rhat values are below 1.05, suggesting the chains have converged.
ggs_effective (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
ggs_effective(loyn.ggs)
Ratios all very high.
ggs_crosscorrelation(loyn.ggs)
ggs_grb(loyn.ggs)
The bayesplot package offers a range of MCMC diagnostics as well as posterior predictive checks (PPC), all of which have a convenient plot() interface. Let's start with the MCMC diagnostics.
available_mcmc()
## bayesplot MCMC module:
## mcmc_acf
## mcmc_acf_bar
## mcmc_areas
## mcmc_areas_data
## mcmc_areas_ridges
## mcmc_areas_ridges_data
## mcmc_combo
## mcmc_dens
## mcmc_dens_chains
## mcmc_dens_chains_data
## mcmc_dens_overlay
## mcmc_hex
## mcmc_hist
## mcmc_hist_by_chain
## mcmc_intervals
## mcmc_intervals_data
## mcmc_neff
## mcmc_neff_data
## mcmc_neff_hist
## mcmc_nuts_acceptance
## mcmc_nuts_divergence
## mcmc_nuts_energy
## mcmc_nuts_stepsize
## mcmc_nuts_treedepth
## mcmc_pairs
## mcmc_parcoord
## mcmc_parcoord_data
## mcmc_rank_hist
## mcmc_rank_overlay
## mcmc_recover_hist
## mcmc_recover_intervals
## mcmc_recover_scatter
## mcmc_rhat
## mcmc_rhat_data
## mcmc_rhat_hist
## mcmc_scatter
## mcmc_trace
## mcmc_trace_data
## mcmc_trace_highlight
## mcmc_violin
Of these, we will focus on:
mcmc_plot(loyn.brm3, type='trace')
The chains appear well mixed and very similar
mcmc_plot(loyn.brm3, type='acf_bar')
There is no evidence of autocorrelation in the MCMC samples
mcmc_plot(loyn.brm3, type='rhat_hist')
All Rhat values are below 1.05, suggesting the chains have converged.
neff (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
mcmc_plot(loyn.brm3, type='neff_hist')
Ratios all very high.
mcmc_plot(loyn.brm3, type='combo')
mcmc_plot(loyn.brm3, type='violin')
The rstan package offers its own range of MCMC diagnostics.
Of these, we will focus on:
stan_trace(loyn.brm3$fit)
The chains appear well mixed and very similar
stan_ac(loyn.brm3$fit)
There is no evidence of autocorrelation in the MCMC samples
stan_rhat(loyn.brm3$fit)
All Rhat values are below 1.05, suggesting the chains have converged.
stan_ess (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
stan_ess(loyn.brm3$fit)
Ratios all very high.
stan_dens(loyn.brm3$fit, separate_chains = TRUE)
The ggmcmc package also has a set of MCMC diagnostic functions.
Of these, we will focus on:
loyn.ggs <- ggs(loyn.brm3)
ggs_traceplot(loyn.ggs)
The chains appear well mixed and very similar
ggs_autocorrelation(loyn.ggs)
There is no evidence of autocorrelation in the MCMC samples
ggs_Rhat(loyn.ggs)
All Rhat values are below 1.05, suggesting the chains have converged.
ggs_effective (number of effective samples): the ratio of the effective sample size (the number of effectively independent samples once autocorrelation is accounted for) to the total number of samples provides an indication of the effectiveness (and efficiency) of the MCMC sampler. Ratios less than 0.5 for a parameter suggest that the sampler struggled to explore that part of the posterior efficiently.
If the ratios are low, running longer chains or tightening the priors may help.
ggs_effective(loyn.ggs)
Ratios all very high.
ggs_crosscorrelation(loyn.ggs)
ggs_grb(loyn.ggs)
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
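The logic of a posterior predictive check can be sketched without any fitted model: draw a replicate data set from each posterior draw and compare a summary statistic of the replicates with the same statistic of the observed data. Everything below (the "observed" data and the posterior draws) is simulated purely for illustration:

```r
set.seed(1)
y_obs <- rlnorm(56, meanlog = 3, sdlog = 0.5)   ## stand-in 'observed' abundances
## pretend posterior draws of the model parameters
post <- data.frame(mu = rnorm(1000, 3, 0.05),
                   sigma = abs(rnorm(1000, 0.5, 0.02)))
## one replicate data set per draw, each summarised by its mean
yrep_mean <- mapply(function(m, s) mean(rlnorm(length(y_obs), m, s)),
                    post$mu, post$sigma)
## posterior predictive p-value for the mean: values near 0 or 1 indicate
## the model fails to reproduce that aspect of the data
mean(yrep_mean > mean(y_obs))
```

The graphical checks below (density overlays, error scatters, intervals) are richer versions of the same comparison.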
available_ppc()
## bayesplot PPC module:
## ppc_bars
## ppc_bars_grouped
## ppc_boxplot
## ppc_data
## ppc_dens
## ppc_dens_overlay
## ppc_ecdf_overlay
## ppc_error_binned
## ppc_error_hist
## ppc_error_hist_grouped
## ppc_error_scatter
## ppc_error_scatter_avg
## ppc_error_scatter_avg_vs_x
## ppc_freqpoly
## ppc_freqpoly_grouped
## ppc_hist
## ppc_intervals
## ppc_intervals_data
## ppc_intervals_grouped
## ppc_loo_intervals
## ppc_loo_pit
## ppc_loo_pit_overlay
## ppc_loo_pit_qq
## ppc_loo_ribbon
## ppc_ribbon
## ppc_ribbon_data
## ppc_ribbon_grouped
## ppc_rootogram
## ppc_scatter
## ppc_scatter_avg
## ppc_scatter_avg_grouped
## ppc_stat
## ppc_stat_2d
## ppc_stat_freqpoly_grouped
## ppc_stat_grouped
## ppc_violin_grouped
pp_check(loyn.rstanarm3, plotfun='dens_overlay')
The model draws appear to deviate from the observed data.
pp_check(loyn.rstanarm3, plotfun='error_scatter_avg')
The predictive error seems to be related to the predictor - the model performs poorest at larger patch areas.
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='error_scatter_avg_vs_x')
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='intervals')
The modelled predictions seem to underestimate the uncertainty with increasing patch area.
pp_check(loyn.rstanarm3, x=loyn$AREA, plotfun='ribbon')
The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.
#library(shinystan)
#launch_shinystan(loyn.rstanarm3)
DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain the simulated residuals from the fitted stan model.
We need to supply the simulated (posterior predicted) responses, the observed responses and the fitted (predicted) values.
preds <- posterior_predict(loyn.rstanarm3, draws=250)
loyn.resids <- createDHARMa(simulatedResponse = t(preds),
observedResponse = loyn$ABUND,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = FALSE)
plot(loyn.resids)
Conclusions:
Posterior predictive checks provide additional diagnostics about the fit of the model. Specifically, they provide a comparison between predictions drawn from the model and the observed data used to train the model.
available_ppc()
## bayesplot PPC module:
## ppc_bars
## ppc_bars_grouped
## ppc_boxplot
## ppc_data
## ppc_dens
## ppc_dens_overlay
## ppc_ecdf_overlay
## ppc_error_binned
## ppc_error_hist
## ppc_error_hist_grouped
## ppc_error_scatter
## ppc_error_scatter_avg
## ppc_error_scatter_avg_vs_x
## ppc_freqpoly
## ppc_freqpoly_grouped
## ppc_hist
## ppc_intervals
## ppc_intervals_data
## ppc_intervals_grouped
## ppc_loo_intervals
## ppc_loo_pit
## ppc_loo_pit_overlay
## ppc_loo_pit_qq
## ppc_loo_ribbon
## ppc_ribbon
## ppc_ribbon_data
## ppc_ribbon_grouped
## ppc_rootogram
## ppc_scatter
## ppc_scatter_avg
## ppc_scatter_avg_grouped
## ppc_stat
## ppc_stat_2d
## ppc_stat_freqpoly_grouped
## ppc_stat_grouped
## ppc_violin_grouped
pp_check(loyn.brm3, type='dens_overlay')
The model draws appear to deviate from the observed data.
pp_check(loyn.brm3, type='error_scatter_avg')
The predictive error seems to be related to the predictor - the model performs poorest at larger patch areas.
pp_check(loyn.brm3, x='AREA', type='error_scatter_avg_vs_x')
pp_check(loyn.brm3, x='AREA', type='intervals')
The modelled predictions seem to underestimate the uncertainty with increasing patch area.
pp_check(loyn.brm3, x='AREA', type='ribbon')
The shinystan package allows the full suite of MCMC diagnostics and posterior predictive checks to be accessed via a web interface.
#library(shinystan)
#launch_shinystan(loyn.brm3)
DHARMa residuals provide very useful diagnostics. Unfortunately, we cannot directly use the simulateResiduals() function to generate the simulated residuals. However, if we are willing to calculate some of the components ourselves, we can still obtain the simulated residuals from the fitted stan model.
We need to supply the simulated (posterior predicted) responses, the observed responses and the fitted (predicted) values.
preds <- posterior_predict(loyn.brm3, nsamples=250, summary=FALSE)
loyn.resids <- createDHARMa(simulatedResponse = t(preds),
observedResponse = loyn$ABUND,
fittedPredictedResponse = apply(preds, 2, median),
integerResponse = FALSE)
plot(loyn.resids)
Conclusions:
loyn.rstanarm3 %>% ggpredict() %>% plot(add.data=TRUE, facet=TRUE)
#loyn.rstanarm3 %>% ggemmeans(~AREA, type='fixed') %>% plot(add.data=TRUE) + scale_y_log10()
loyn.rstanarm3 %>% fitted_draws(newdata=loyn) %>%
median_hdci() %>%
ggplot(aes(x=AREA, y=.value)) +
geom_ribbon(aes(ymin=.lower, ymax=.upper), fill='blue', alpha=0.3) +
geom_line() +
geom_point(data=loyn, aes(y=ABUND, x=AREA)) +
scale_y_log10()
loyn.brm3 %>%
conditional_effects() %>%
plot(ask=FALSE, points=TRUE) %>%
sjPlot::plot_grid()
#loyn.brm3 %>% conditional_effects(spaghetti=TRUE,nsamples=200)
loyn.brm3 %>% ggpredict() %>% plot(add.data=TRUE) %>%
sjPlot::plot_grid()
loyn.brm3 %>% ggemmeans(~AREA) %>% plot(add.data=TRUE)
loyn.brm3 %>% fitted_draws(newdata=loyn) %>%
median_hdci() %>%
ggplot(aes(x=AREA, y=.value)) +
geom_ribbon(aes(ymin=.lower, ymax=.upper), fill='blue', alpha=0.3) +
geom_line() +
geom_point(data=loyn, aes(y=ABUND, x=AREA)) +
scale_y_log10()
rstanarm captures the MCMC samples from stan within the returned list. There are numerous ways to retrieve and summarise these samples. The first three provide convenient numeric summaries from which you can draw conclusions, the last four provide ways of obtaining the full posteriors.
The summary() method generates simple summaries (mean, standard deviation as well as 10, 50 and 90 percentiles).
summary(loyn.rstanarm3)
##
## Model Info:
## function: stan_glm
## family: gaussian [log]
## formula: ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) +
## fGRAZE + scale(ALT) + scale(YR.ISOL)
## algorithm: sampling
## sample: 1800 (posterior sample size)
## priors: see help('prior_summary')
## observations: 56
## predictors: 10
##
## Estimates:
## mean sd 10% 50% 90%
## (Intercept) 3.1 0.1 2.9 3.1 3.2
## scale(log(DIST)) 0.0 0.1 -0.1 0.0 0.1
## scale(log(LDIST)) 0.1 0.1 0.0 0.1 0.1
## scale(log(AREA)) 0.2 0.1 0.1 0.2 0.3
## fGRAZE2 0.0 0.2 -0.2 0.0 0.2
## fGRAZE3 0.0 0.1 -0.2 0.0 0.2
## fGRAZE4 0.0 0.2 -0.2 0.0 0.2
## fGRAZE5 -1.2 0.4 -1.7 -1.2 -0.7
## scale(ALT) 0.0 0.1 -0.1 0.0 0.1
## scale(YR.ISOL) 0.0 0.1 -0.1 0.0 0.1
## sigma 6.6 0.7 5.8 6.6 7.6
##
## Fit Diagnostics:
## mean sd 10% 50% 90%
## mean_PPD 19.2 1.2 17.6 19.2 20.8
##
## The mean_ppd is the sample average posterior predictive distribution of the outcome variable (for details see help('summary.stanreg')).
##
## MCMC diagnostics
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 1711
## scale(log(DIST)) 0.0 1.0 1465
## scale(log(LDIST)) 0.0 1.0 1704
## scale(log(AREA)) 0.0 1.0 1650
## fGRAZE2 0.0 1.0 1689
## fGRAZE3 0.0 1.0 1736
## fGRAZE4 0.0 1.0 1561
## fGRAZE5 0.0 1.0 1621
## scale(ALT) 0.0 1.0 1413
## scale(YR.ISOL) 0.0 1.0 1590
## sigma 0.0 1.0 1710
## mean_PPD 0.0 1.0 1724
## log-posterior 0.1 1.0 1574
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
Conclusions:
The slope for DIST (the rate at which the log abundance of forest birds changes per one unit change in the scaled, log-transformed distance to the nearest patch, holding the other continuous predictors constant and grazing intensity at level 1) has a posterior mean and median of approximately 0 with a standard deviation of 0.1. The central 80% credibility interval (10th to 90th percentiles, -0.1 to 0.1) comfortably spans zero - there is no evidence of a trend.
tidyMCMC(loyn.rstanarm3$stanfit, estimate.method='median', conf.int=TRUE, conf.method='HPDinterval', rhat=TRUE, ess=TRUE)
Conclusions:
tidyMCMC() summarises the same posterior via medians and highest posterior density (HPD) intervals. As with the percentile-based summaries above, the HPD interval for the DIST slope spans zero, so there is no evidence of a trend.
loyn.rstanarm3 %>% get_variables()
## [1] "(Intercept)" "scale(log(DIST))" "scale(log(LDIST))"
## [4] "scale(log(AREA))" "fGRAZE2" "fGRAZE3"
## [7] "fGRAZE4" "fGRAZE5" "scale(ALT)"
## [10] "scale(YR.ISOL)" "sigma" "accept_stat__"
## [13] "stepsize__" "treedepth__" "n_leapfrog__"
## [16] "divergent__" "energy__"
loyn.draw <- loyn.rstanarm3 %>% gather_draws(`.Intercept.*|.*AREA.*|.*DIST.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE)
loyn.draw
loyn.rstanarm3 %>% plot(plotfun='mcmc_intervals')
loyn.rstanarm3 %>% tidy_draws()
loyn.rstanarm3 %>% spread_draws(`.Intercept.*|.*DIST.*|.*AREA.*|.*GRAZE.*|.*ALT.*|.*YR.*`, regex=TRUE)
loyn.rstanarm3 %>% posterior_samples() %>% as_tibble()
loyn.rstanarm3 %>% bayes_R2() %>% median_hdci
brms also captures the MCMC samples from Stan within the returned object. There are numerous ways to retrieve and summarise these samples. The first three provide convenient numeric summaries from which you can draw conclusions; the last four provide ways of obtaining the full posteriors.
The summary() method generates simple summaries (posterior means, estimation errors and 95% credibility intervals).
summary(loyn.brm3)
## Family: gaussian
## Links: mu = log; sigma = identity
## Formula: ABUND ~ scale(log(DIST)) + scale(log(LDIST)) + scale(log(AREA)) + fGRAZE + scale(ALT) + scale(YR.ISOL)
## Data: loyn (Number of observations: 56)
## Samples: 3 chains, each with iter = 5000; warmup = 1000; thin = 5;
## total post-warmup samples = 2400
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.06 0.12 2.81 3.28 1.00 2378 2365
## scalelogDIST -0.01 0.06 -0.13 0.11 1.00 2504 2161
## scalelogLDIST 0.05 0.06 -0.08 0.17 1.00 2485 2204
## scalelogAREA 0.21 0.06 0.09 0.34 1.00 2464 2410
## fGRAZE2 0.03 0.16 -0.29 0.35 1.00 2283 2146
## fGRAZE3 0.03 0.15 -0.28 0.33 1.00 2397 2225
## fGRAZE4 -0.00 0.17 -0.35 0.32 1.00 2329 2251
## fGRAZE5 -1.19 0.43 -2.09 -0.47 1.00 2391 2027
## scaleALT -0.00 0.05 -0.10 0.10 1.00 2511 2066
## scaleYR.ISOL -0.01 0.08 -0.15 0.16 1.00 2382 2274
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 6.62 0.70 5.41 8.18 1.00 2439 2290
##
## Samples were drawn using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
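The same convergence screening can be automated for the brms fit; a sketch using the rhat() and neff_ratio() helpers available for brmsfit objects (again, the thresholds are heuristics only):

```r
## Extract per-parameter convergence diagnostics from the brmsfit object
rhats <- rhat(loyn.brm3)
rhats[rhats > 1.01]            # parameters with worrying Rhat (ideally none)
ratios <- neff_ratio(loyn.brm3)
ratios[ratios < 0.5]           # parameters with low sampling efficiency
```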
Conclusions:
The slope for DIST (the rate at which the log abundance of forest birds changes per one unit change in the scaled, log-transformed distance to the nearest patch, holding the other continuous predictors constant and grazing intensity at level 1) has a posterior mean of -0.01 with an estimation error of 0.06. The 95% credibility interval (-0.13 to 0.11) comfortably spans zero - there is no evidence of a trend.
tidyMCMC(loyn.brm3$fit, estimate.method='median', conf.int=TRUE, conf.method='HPDinterval', rhat=TRUE, ess=TRUE)
Conclusions:
tidyMCMC() summarises the same posterior via medians and highest posterior density (HPD) intervals; the HPD interval for the DIST slope again spans zero, so the conclusion is unchanged.
loyn.brm3 %>% get_variables()
## [1] "b_Intercept" "b_scalelogDIST" "b_scalelogLDIST" "b_scalelogAREA"
## [5] "b_fGRAZE2" "b_fGRAZE3" "b_fGRAZE4" "b_fGRAZE5"
## [9] "b_scaleALT" "b_scaleYR.ISOL" "sigma" "prior_Intercept"
## [13] "prior_b" "prior_sigma" "lp__" "accept_stat__"
## [17] "stepsize__" "treedepth__" "n_leapfrog__" "divergent__"
## [21] "energy__"
loyn.draw <- loyn.brm3 %>% gather_draws(`^b_.*`, regex=TRUE)
loyn.draw
loyn.brm3 %>% mcmc_plot(type='intervals')
loyn.brm3 %>% tidy_draws()
loyn.brm3 %>% spread_draws(`^b_.*`, regex=TRUE)
loyn.brm3 %>% posterior_samples() %>% as_tibble()
loyn.brm3 %>% bayes_R2(summary=FALSE) %>% median_hdci
In the frequentist installment of this example, we explored three different approaches to pursuing parsimonious models. The first of these was to ‘dredge’ through all possible combinations of predictors and compare the resulting models via AICc. This approach has been widely criticised as it has the potential to yield models that are numerically ‘best’, yet ecologically meaningless.
The second approach was to average together the parameter estimates from many combinations of models so as to yield parameter estimates that are less dependent on the exact combination of predictors included in the model.
The final approach was to propose, a priori, a small (<10) number of candidate models, each of which focusses on a different aspect of the potential underlying ecological responses, and to see whether any of those models are supported by the data. With this approach, the candidate models are likely to be sensible (as they are proposed on the basis of theory rather than purely numbers) and, importantly, they are also interpretable. Furthermore, it promotes the idea that multiple models might each offer important (and different) insights, rather than there being a single ‘best’ model.
For this installment, we will adopt the third approach. In the current context of birds in fragmented forests, we could propose the following models:
Note that these are all models that could be proposed before even collecting the data, and all can be explained ecologically. By contrast, dredging could, purely by chance, suggest a model with a combination of predictors that is very difficult to interpret and explain.
As the above models contain fewer predictors, there is now scope to include interactions.
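For concreteness, the five candidate models fit via update() below can be written out as full formulas (the hypothesis labels here are illustrative only, not part of the original analysis):

```r
## Each formula corresponds to one of the update() calls that follow
candidate_formulas <- list(
  isolation = ABUND ~ scale(log(DIST)) * scale(log(LDIST)),       # patch isolation
  habitat   = ABUND ~ scale(log(AREA)) * fGRAZE,                  # patch size and grazing
  history   = ABUND ~ scale(log(AREA)) * fGRAZE * scale(YR.ISOL), # size, grazing and isolation time
  altitude  = ABUND ~ scale(ALT),
  null      = ABUND ~ 1                                           # intercept-only baseline
)
```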
loyn.rstanarm4a <- update(loyn.rstanarm3, .~scale(log(DIST))*scale(log(LDIST)), diagnostic_file = file.path(tempdir(), "dfa.csv"))
loyn.rstanarm4b <- update(loyn.rstanarm3, .~scale(log(AREA)) * fGRAZE, diagnostic_file = file.path(tempdir(), "dfb.csv"))
loyn.rstanarm4c <- update(loyn.rstanarm3, .~scale(log(AREA)) * fGRAZE * scale(YR.ISOL), diagnostic_file = file.path(tempdir(), "dfc.csv"))
loyn.rstanarm4d <- update(loyn.rstanarm3, .~scale(ALT), diagnostic_file = file.path(tempdir(), "dfd.csv"))
loyn.rstanarm4e <- update(loyn.rstanarm3, .~1, diagnostic_file = file.path(tempdir(), "dfe.csv"))
loo_compare(loo(loyn.rstanarm4a),
loo(loyn.rstanarm4e)
)
## elpd_diff se_diff
## loyn.rstanarm4e 0.0 0.0
## loyn.rstanarm4a -2.6 1.6
loo_compare(loo(loyn.rstanarm4b),
loo(loyn.rstanarm4e)
)
## elpd_diff se_diff
## loyn.rstanarm4b 0.0 0.0
## loyn.rstanarm4e -29.9 7.3
loo_compare(loo(loyn.rstanarm4c),
loo(loyn.rstanarm4e)
)
## elpd_diff se_diff
## loyn.rstanarm4c 0.0 0.0
## loyn.rstanarm4e -19.4 6.6
loo_compare(loo(loyn.rstanarm4d),
loo(loyn.rstanarm4e)
)
## elpd_diff se_diff
## loyn.rstanarm4d 0.0 0.0
## loyn.rstanarm4e -3.6 2.3
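Rather than a series of pairwise comparisons against the null model, loo_compare() will also accept all of the candidates at once and rank them by expected log predictive density (elpd); a sketch:

```r
## Compare all five candidate models simultaneously; the best model sits
## on top with elpd_diff = 0 and the rest are expressed relative to it
loo_compare(loo(loyn.rstanarm4a), loo(loyn.rstanarm4b),
            loo(loyn.rstanarm4c), loo(loyn.rstanarm4d),
            loo(loyn.rstanarm4e))
```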
bayes_factor(bridge_sampler(loyn.rstanarm4a),
bridge_sampler(loyn.rstanarm4e))
## Estimated Bayes factor in favor of x1 over x2: 0.00009
bayes_factor(bridge_sampler(loyn.rstanarm4b),
bridge_sampler(loyn.rstanarm4e))
## Estimated Bayes factor in favor of x1 over x2: 810742.20791
bayes_factor(bridge_sampler(loyn.rstanarm4c),
bridge_sampler(loyn.rstanarm4e))
## Estimated Bayes factor in favor of x1 over x2: 0.01252
bayes_factor(bridge_sampler(loyn.rstanarm4d),
bridge_sampler(loyn.rstanarm4e))
## Estimated Bayes factor in favor of x1 over x2: 2.00744
loyn.brm4a <- update(loyn.brm3, .~scale(log(DIST))*scale(log(LDIST)), save_pars=save_pars(all=TRUE), refresh=0)
loyn.brm4b <- update(loyn.brm3, .~scale(log(AREA)) * fGRAZE, save_pars=save_pars(all=TRUE), refresh=0)
loyn.brm4c <- update(loyn.brm3, .~scale(log(AREA)) * fGRAZE * scale(YR.ISOL), save_pars=save_pars(all=TRUE), refresh=0)
loyn.brm4d <- update(loyn.brm3, .~scale(ALT), save_pars=save_pars(all=TRUE), refresh=0)
loyn.brm4e <- update(loyn.brm3, .~1, save_pars=save_pars(all=TRUE), refresh=0)
loo_compare(loo(loyn.brm4a),
loo(loyn.brm4e)
)
## elpd_diff se_diff
## loyn.brm4e 0.0 0.0
## loyn.brm4a -2.5 1.6
loo_compare(loo(loyn.brm4b),
loo(loyn.brm4e)
)
## elpd_diff se_diff
## loyn.brm4b 0.0 0.0
## loyn.brm4e -30.2 7.1
loo_compare(loo(loyn.brm4c),
loo(loyn.brm4e)
)
## elpd_diff se_diff
## loyn.brm4c 0.0 0.0
## loyn.brm4e -19.6 6.7
loo_compare(loo(loyn.brm4d),
loo(loyn.brm4e)
)
## elpd_diff se_diff
## loyn.brm4d 0.0 0.0
## loyn.brm4e -3.6 2.3
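As with the rstanarm fits, the brms candidates can all be passed to loo_compare() in a single call and ranked by elpd (sketch):

```r
## Joint comparison of all five brms candidate models
loo_compare(loo(loyn.brm4a), loo(loyn.brm4b),
            loo(loyn.brm4c), loo(loyn.brm4d),
            loo(loyn.brm4e))
```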
bayes_factor(loyn.brm4a,
loyn.brm4e)
## Estimated Bayes factor in favor of loyn.brm4a over loyn.brm4e: 0.00018
bayes_factor(loyn.brm4b,
loyn.brm4e)
## Estimated Bayes factor in favor of loyn.brm4b over loyn.brm4e: 5048156.97343
bayes_factor(loyn.brm4c,
loyn.brm4e)
## Estimated Bayes factor in favor of loyn.brm4c over loyn.brm4e: 0.39253
bayes_factor(loyn.brm4d,
loyn.brm4e)
## Estimated Bayes factor in favor of loyn.brm4d over loyn.brm4e: 2.54195
loyn.list = with(loyn, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(loyn.rstanarm3, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
as.data.frame
head(newdata)
ggplot(newdata, aes(y=response, x=AREA)) +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
geom_line(aes(color=fGRAZE)) +
theme_bw() +
scale_x_log10() +
scale_y_log10()
spaghetti = emmeans(loyn.rstanarm3, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
gather_emmeans_draws() %>% mutate(Fit=exp(.value))
wch = sample(1:max(spaghetti$.draw), 100,replace=FALSE)
spaghetti = spaghetti %>% filter(.draw %in% wch)
ggplot(newdata) +
geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
group=interaction(fGRAZE,.draw)), alpha=0.05) +
geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
theme_bw() +
scale_x_log10() + scale_y_log10()
loyn.list = with(loyn, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(loyn.rstanarm4b, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
as.data.frame
head(newdata)
ggplot(newdata, aes(y=response, x=AREA)) +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
geom_line(aes(color=fGRAZE)) +
theme_bw() +
scale_x_log10() +
scale_y_log10()
spaghetti = emmeans(loyn.rstanarm4b, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
gather_emmeans_draws() %>% mutate(Fit=exp(.value))
wch = sample(1:max(spaghetti$.draw), 100,replace=FALSE)
spaghetti = spaghetti %>% filter(.draw %in% wch)
ggplot(newdata) +
geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
group=interaction(fGRAZE,.draw)), alpha=0.05) +
geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
theme_bw() +
scale_x_log10() + scale_y_log10()
loyn.list = with(loyn, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(loyn.brm3, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
as.data.frame
head(newdata)
ggplot(newdata, aes(y=response, x=AREA)) +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
geom_line(aes(color=fGRAZE)) +
theme_bw() +
scale_x_log10() +
scale_y_log10()
spaghetti = emmeans(loyn.brm3, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
gather_emmeans_draws() %>% mutate(Fit=exp(.value))
wch = sample(1:max(spaghetti$.draw), 100,replace=FALSE)
spaghetti = spaghetti %>% filter(.draw %in% wch)
ggplot(newdata) +
geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
group=interaction(fGRAZE,.draw)), alpha=0.05) +
geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
theme_bw() +
scale_x_log10() + scale_y_log10()
loyn.list = with(loyn, list(AREA = seq(min(AREA), max(AREA), len=100)))
newdata = emmeans(loyn.brm4b, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
as.data.frame
head(newdata)
ggplot(newdata, aes(y=response, x=AREA)) +
geom_ribbon(aes(ymin=lower.HPD, ymax=upper.HPD, fill=fGRAZE), alpha=0.3) +
geom_line(aes(color=fGRAZE)) +
theme_bw() +
scale_x_log10() +
scale_y_log10()
spaghetti = emmeans(loyn.brm4b, ~AREA|fGRAZE, at = loyn.list, type='response') %>%
gather_emmeans_draws() %>% mutate(Fit=exp(.value))
wch = sample(1:max(spaghetti$.draw), 100,replace=FALSE)
spaghetti = spaghetti %>% filter(.draw %in% wch)
ggplot(newdata) +
geom_line(data=spaghetti, aes(y=Fit, x=AREA, color=fGRAZE,
group=interaction(fGRAZE,.draw)), alpha=0.05) +
geom_line(aes(y=response, x=AREA, color=fGRAZE)) +
theme_bw() +
scale_x_log10() + scale_y_log10()
Loyn, R. H. 1987. “Effects of Patch Area and Habitat on Bird Abundances, Species Numbers and Tree Health in Fragmented Victorian Forests.” In Nature Conservation: The Role of Remnants of Native Vegetation, edited by D. A. Saunders, G. W. Arnold, A. A. Burbidge, and A. J. M. Hopkins. Chipping Norton, NSW: Surrey Beatty & Sons.